Crowd Prefers the Middle Path: A New IAA Metric for Crowdsourcing Reveals Turker Biases in Query Segmentation

Authors

  • Rohan Ramanath
  • Monojit Choudhury
  • Kalika Bali
  • Rishiraj Saha Roy
Abstract

Query segmentation, like text chunking, is the first step towards query understanding. In this study, we explore the effectiveness of crowdsourcing for this task. Through carefully designed control experiments and Inter Annotator Agreement metrics for analysis of experimental data, we show that crowdsourcing may not be a suitable approach for query segmentation because the crowd seems to have a very strong bias towards dividing the query into roughly equal (often only two) parts. Similarly, in the case of hierarchical or nested segmentation, turkers have a strong preference towards balanced binary trees.

Similar resources

Supplementary Material for Prefers the Middle Path: A New IAA Metric for Crowdsourcing Reveals Turker Biases

Segmentation annotations of the Q500, QG500, Q700, S300 and QRand datasets [1] are contained in the accompanying folder named “Datasets”. All the data files are released in JSON format (similar to XML) in order to allow easy interoperability between the data and code. The naming convention used is 〈dataset name〉 flat.json for flat segmentation and 〈dataset name〉 nested.json for nested segmenta...

Perform Three Data Mining Tasks with Crowdsourcing Process

In data mining studies, feature selection is too complex to carry out entirely by hand, so some of the labeling has to be sent to crowd workers. The process of outsourcing data mining tasks to users is often handled by software systems without enough knowledge of the users' age or place of residence. Uncertainty about the performance of virtual user...

Tool Diversity as a Means of Improving Aggregate Crowd Performance on an Object Segmentation Task

Crowdsourcing is a common means of collecting training data, such as image segmentations, for many computer vision applications. However, designing accurate crowd-powered image segmentation systems is challenging because defining the boundaries of an object in an image requires considerable fine motor skills and hand-eye coordination that leads to some level of errors from every participant. Ty...

Crowdsourcing for Robustness in Web Search

Search systems are typically evaluated by averaging an effectiveness measure over a set of queries. However, this method does not capture the robustness of the retrieval approach, as measured by its variability across queries. Robustness can be a critical retrieval property, especially in settings such as commercial search engines that must build user trust and maintain brand quality. This ...

Where To: Crowd-Aided Path Selection

With the widespread use of geo-positioning services (GPS), GPS-based navigation systems have become an ever more integral part of our daily lives. GPS-based navigation systems usually suggest multiple paths for any given pair of source and target, leaving users perplexed when trying to select the best one among them, namely the problem of best path selection. Too many suggested paths may jeop...

Publication date: 2013